ABSTRACT

We have compiled MultitaskProtDB, available online 
at http://wallace.uab.es/multitask, to provide a re- 
pository where the many multitasking proteins 
found in the literature can be stored. Multitasking 
or moonlighting is the capability of some proteins 
to execute two or more biological functions. 
Usually, multitasking proteins are experimentally 
revealed by serendipity. This ability of proteins to 
perform multitasking functions helps us to under- 
stand one of the ways used by cells to perform 
many complex functions with a limited number of 
genes. Even so, the study of this phenomenon is 
complex because, among other things, there is no 
database of moonlighting proteins. The existence of 
such a tool facilitates the collection and dissemin- 
ation of these important data. This work reports the 
database, MultitaskProtDB, which is designed as a 
user-friendly web page containing >288 multitasking
proteins with their NCBI and UniProt accession 
numbers, canonical and additional biological func- 
tions, monomeric/oligomeric states, PDB codes 
when available and bibliographic references. This 
database also serves to gain insight into some char- 
acteristics of multitasking proteins such as 
frequencies of the different pairs of functions, 
phylogenetic conservation and so forth. 

INTRODUCTION 

Multitasking or moonlighting refers to those proteins pre- 
senting two or more functions performed by a single poly- 
peptide chain. They were initially reported by Wistow and 
Piatigorsky in the late 1980s when lens crystallins turned
out to be the previously known metabolic enzymes (1,2). 



The term 'moonlighting' was coined by Constance Jeffery
(3), whereas Joran Piatigorsky proposed 'gene sharing' (4). 
Multitasking proteins present alternative functions that 
are mostly related to cellular localization, cell type, 
oligomeric state, concentration of cellular ligands, substrates,
cofactors, products or post-translational modifications
(3-12). In many cases, a protein uses a combination
of these mechanisms to switch between functions. 
Although some findings can suggest the involvement of a protein
in extra functions, e.g. multitasking proteins can be found
in different cellular localizations or in amounts exceeding
those required for their canonical function, multitasking
proteins are usually revealed experimentally by serendipity.
Therefore, any alternative method to identify these
proteins would be valuable. In previous work, we have
explored the possibility of identifying multitasking
proteins using bioinformatics approaches (13) and
protein interactomics database information (14). Some
authors have suggested that there is a relationship 
between protein conformational fluctuations and promis- 
cuous functions of proteins, whereas some structurally dis- 
ordered regions involved in their interaction with different 
partners are crucial (15,16). Nevertheless, although there 
are examples of multitasking proteins belonging to the 
Intrinsically Disordered Protein Class (e.g. p53), in a
recent work we found that multitasking proteins are not 
more prone to belong to the Intrinsically Disordered 
Proteins (IDP) class than the average (17). 

During the development of our previous work aimed at 
trying to find bioinformatics approaches to predict multi- 
tasking proteins, we encountered the difficulty of collect- 
ing examples of such proteins because of the lack of a 
broad database, so the effort to gather the examples was 
often one of the main challenges. To facilitate the work of
researchers interested in the field, we decided to make our 
set of multitasking proteins freely available as a web 
database. To our knowledge, a database of multitasking 
proteins has not yet been compiled. Through extensive data
mining, we have found ~288 proteins reported elsewhere
as multitasking proteins, with which we have built a
database, named MultitaskProtDB, and designed the
corresponding web interface, http://wallace.uab.es/multitask/.
The database contains information on and direct links to all
these proteins, as well as their accession numbers, the species
to which they belong, canonical and additional biological
functions, PDB codes, if available, and the corresponding
publications (18-20). Even though the different functions
are called 'canonical' and 'moonlighting' in our database,
this does not imply any difference in biological relevance and
merely reflects the historical order in which the functions
were discovered. The question of which function came first
and which was acquired later could be addressed
by evolutionary comparative analysis, and our
database may help to perform these studies. There are probably
further multitasking cases hidden in the
literature in which the authors have not recognized this
phenomenon or have not described their proteins as such.

MATERIALS AND METHODS 

Sources of the database 

In addition to the examples extracted from the small 
number of reviews about multitasking proteins (3-12), 
we have collected >288 multitasking proteins from an in- 
spection of the NCBI PubMed server (19). The literature 
mining has been performed using the following terms and 
key words: moonlight proteins; moonlighting proteins; 
multitask protein; multitasking proteins; moonlight
enzymes; moonlighting enzymes; and gene sharing. A 
number of examples have been found by serendipity 
from some reviews on protein function, bibliography of 
sequenced genomes and so forth. 
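
The keyword search described above can be sketched as a single PubMed query. The following is a minimal illustration, assuming the terms are combined as quoted phrases joined with OR and submitted through the NCBI E-utilities ESearch endpoint; the text does not state how the queries were actually issued, and the function names and the retmax value are hypothetical. The code only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Search terms as listed in the text.
TERMS = [
    "moonlight proteins",
    "moonlighting proteins",
    "multitask protein",
    "multitasking proteins",
    "moonlight enzymes",
    "moonlighting enzymes",
    "gene sharing",
]

# NCBI E-utilities ESearch endpoint (assumed access route, not stated in the text).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(terms):
    """Join the terms as quoted phrases combined with OR."""
    return " OR ".join(f'"{t}"' for t in terms)


def esearch_url(terms, retmax=100):
    """Build (but do not send) an ESearch request URL for PubMed."""
    params = {"db": "pubmed", "term": build_query(terms), "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


print(build_query(TERMS))
print(esearch_url(TERMS))
```

Hits from such a query would still need manual inspection, since several of these terms also appear in unrelated contexts.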

Design of the database 

The database has been created using MySQL. The 
webserver has been designed with the PHP programming 
language and assisted by PHPRunner, an application that 
helps to generate PHP code and to create reports, lists and
forms facilitating the development of the important parts 
of the web. These reports can also be generated using an 
advanced search engine to allow a more accurate or restricted
search. This procedure narrows the search to the subset of
proteins on which one wants to focus the study.
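
To illustrate what a restricted search over such a table amounts to, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name, column names and rows are hypothetical placeholders and do not reproduce the actual MultitaskProtDB schema or its PHPRunner-generated code.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not the real ones.
SCHEMA = """
CREATE TABLE multitask_protein (
    id INTEGER PRIMARY KEY,
    protein_name TEXT,
    canonical_function TEXT,
    moonlighting_function TEXT,
    organism TEXT,
    pdb_code TEXT,          -- NULL when no structure is available
    oligomeric_state TEXT   -- NULL when not reported
)
"""


def restricted_search(conn, organism=None, function_keyword=None):
    """Return protein names matching every filter that was supplied."""
    clauses, params = [], []
    if organism is not None:
        clauses.append("organism = ?")
        params.append(organism)
    if function_keyword is not None:
        clauses.append(
            "(canonical_function LIKE ? OR moonlighting_function LIKE ?)"
        )
        params.extend([f"%{function_keyword}%"] * 2)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = f"SELECT protein_name FROM multitask_protein{where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Placeholder rows for illustration only; they are not database entries.
conn.executemany(
    "INSERT INTO multitask_protein "
    "(protein_name, canonical_function, moonlighting_function, organism) "
    "VALUES (?, ?, ?, ?)",
    [
        ("protein A", "enzyme", "transcription factor", "organism X"),
        ("protein B", "enzyme", "adhesion", "organism Y"),
        ("protein C", "chaperone", "adhesion", "organism X"),
    ],
)

print(restricted_search(conn, organism="organism X", function_keyword="adhesion"))
```

Each supplied filter adds one condition joined with AND, so the result set can only shrink as criteria are added, which is the narrowing behaviour described above. Values are passed as bound parameters rather than interpolated into the SQL string.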

RESULTS 

On opening the database web page, a large table containing
288 entries of multitasking proteins is shown (see
Figure 1). It is divided into 15 pages, with a maximum of 20
entries per page, and gives information on all the collected
multitasking proteins. There are 12 columns in the table to
characterize each protein. From left to right, they are as
follows. Column 1 is a clickable button to see the
complete record details. Column 2 allows entries to be
selected so that their contents can be exported and
manipulated, if required.
Column 3 (ID) indicates the sequential number of the
entry in the table. Columns 4 (NCBI Code) and 5
(UniProt Code) show the NCBI and UniProt accession
numbers, respectively, which are linked to the corresponding
database records (19,20). Column 6 (Protein
Name) displays the protein name and the corresponding
Enzyme Commission (EC) number (21). Columns 7
(Canonical Function) and 8 (Moonlighting Function)
show the canonical and moonlighting functions,
respectively. Column 9 (Organism) indicates the organism in
which the moonlighting protein has been identified.
Column 10 (PDB) links to the PDB 3D structure of the
protein, if available (18). Column 11 (Oligomeric State)
indicates the oligomeric state of the protein,
when reported. There are proteins whose multitasking
function depends on whether they are in a monomeric or
oligomeric state. This is the case for one of the major
multitasking proteins, glyceraldehyde 3-phosphate dehydrogenase
(GAPDH). Column 12 (Reference) provides a link to
the PubMed bibliographic reference (19). Some display,
print and search facilities are provided by the web page.
Moreover, the whole database or the selected
entries can easily be exported to a file in the
data format required by the user for further analysis,
such as Excel, Word, Comma Separated Values (CSV) or
Extensible Markup Language (XML). The database is accessible at
http://wallace.uab.es/multitask/.
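
The paging figures quoted above follow from simple arithmetic: 288 entries at 20 per page give ceil(288/20) = 15 pages, the last holding 8 entries. The sketch below checks this and shows one way selected entries could be written as CSV. It is an illustration only; the helper names are hypothetical and the web site's own export is produced by its PHP code, not by this script.

```python
import csv
import io
import math


def page_count(n_entries, per_page):
    """Number of pages needed to show n_entries."""
    return math.ceil(n_entries / per_page)


def page_slice(entries, page, per_page):
    """Entries shown on a given page (pages are numbered from 1)."""
    start = (page - 1) * per_page
    return entries[start:start + per_page]


def export_csv(records, fields):
    """Serialize a list of dict records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


ids = list(range(1, 289))          # entry IDs 1..288
print(page_count(len(ids), 20))    # pages needed
print(len(page_slice(ids, 15, 20)))  # entries on the last page

# Placeholder record for illustration; not an actual database entry.
selected = [{"ID": 1, "Organism": "organism X"}]
print(export_csv(selected, ["ID", "Organism"]))
```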

An overview of the database shows that most reported
moonlighting proteins present two biological functions.
As could be expected, most pairs of functions correspond
to different cell compartments when dealing with eukaryotic
proteins. When the canonical and the moonlighting functions
are described with broad Gene Ontology descriptors
(22), e.g. 'enzyme and transcription factor' or 'enzyme and cell
adhesion', 30 distinct pairs can be found in the database. The
most prevalent pair is 'enzyme-nucleic acid binding
protein' (74 of 288 moonlighting proteins), a class that
includes transcription factors and other nucleic acid binding
proteins. Another finding is the lack of integral
membrane proteins, which is logical because multitasking
proteins usually perform each function in a different cellular
compartment, which is problematic for membrane
proteins. Nevertheless, the second most prevalent pair of
functions corresponds to 'enzyme-adhesion protein' in
pathogenic microorganisms (48 of 288 moonlighting
proteins). It is well known that many pathogens
use metabolic enzymes that are not integral membrane
proteins as adhesion elements that bind host proteins; this
requires the enzymes to reach the membrane through different
mechanisms (23,24). Owing to the high number of cases
reported for crystallin proteins, 'enzyme-structural
protein' pairs are also abundant (30 of 288).
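
Expressed as percentages of the 288 entries, the three counts given above are about 25.7%, 16.7% and 10.4%. The sketch below reproduces this arithmetic and shows how such pairs could be tallied from (canonical, moonlighting) annotations. It assumes each entry is assigned to exactly one pair, which the text does not state explicitly, and the function names are hypothetical.

```python
from collections import Counter

TOTAL = 288

# Counts quoted in the text for the three most frequent pairs.
PAIR_COUNTS = {
    "enzyme-nucleic acid binding protein": 74,
    "enzyme-adhesion protein": 48,
    "enzyme-structural protein": 30,
}


def tally_pairs(annotations):
    """Count how often each (canonical, moonlighting) pair occurs."""
    return Counter(f"{canonical}-{moonlighting}" for canonical, moonlighting in annotations)


def percentages(counts, total):
    """Share of each pair, in percent, rounded to one decimal place."""
    return {pair: round(100 * n / total, 1) for pair, n in counts.items()}


shares = percentages(PAIR_COUNTS, TOTAL)
for pair, share in shares.items():
    print(f"{pair}: {share}%")

# Entries left for the remaining pairs, under the one-pair-per-entry assumption.
print(TOTAL - sum(PAIR_COUNTS.values()))
```

Under that assumption, the three leading pairs cover 152 of the 288 entries, leaving 136 spread over the other 27 pairs.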



DISCUSSION 

Although several short reviews on moonlighting proteins
exist (3-12), they generally report only a small number of
examples, up to 30-40 at most.

One of the most striking issues of the mammalian
(human) genome is the low number of protein-coding
genes. To date, the main molecular mechanism known to
increase the number of protein isoforms and functions is
alternative splicing. However, a less known way to
increase the number of protein functions is the existence
of multifunctional, multitasking or 'moonlighting'
proteins. Unlike splicing, multitasking can be used by
microorganisms. For example, minimal cells such as the
Mollicutes or mycoplasmas (which are also an experimental
subject of the authors) seem to make extensive
use of moonlighting (25,26). We have previously
reported that the protein HsdS from Mycoplasma
genitalium, which was annotated as the DNA-binding
subunit of the restriction system, is also a cytoskeletal
protein (27). As stated by Jeffery (9), current moonlighting
proteins 'appear to be only the tip of the iceberg'.

Figure 1. A screenshot of the MultitaskProtDB page. Currently, the database contains information on ~288 multitasking proteins that can be easily viewed
with the search button and other display facilities. Some characteristics of some multitasking proteins are not present in the
database because no data have been found, especially PDB structure or oligomeric state. The last column, 'Reference', links to the NCBI
PubMed article.

Predicting multitasking proteins will be useful for researchers
when designing a knockout experiment, because knocking out
such a protein could have off-target or side effects involving
some hidden phenotypic traits. In previous work, we
have suggested bioinformatics methods to predict 
protein multifunctionality (13,14). The MultitaskProtDB 
database will help researchers to identify protein charac- 
teristics and group them to gain insight into protein bio- 
logical function. 

Updates of the database are planned periodically,
adding new multitasking proteins as they
appear in the literature. These data could help the
bioinformatics identification of multitasking proteins and serve
as a source of data to create models or validate hypotheses
about these proteins. We also wish to ask for the collaboration
of researchers who work on these
proteins and want to include their published examples.
If your protein is not listed in the database and you
want to include it, please send us an email indicating the
specific content you want to appear in each field of the
table and the reference.

Another interesting question is whether some
multitasking proteins have more than two different
functions and act as hubs in protein-protein interaction
networks. A preliminary analysis of a smaller set of
multitasking proteins carried out in our laboratory (14) showed
that a number of them would correspond to hubs,
especially those involved in energy metabolism. In fact, from
interactomics it is known that the complexes with the most
edges (connections) correspond to those of energy
metabolism and protein synthesis. However, we have not yet
extended the analysis to the present database. In general,
moonlighting is also important for the molecular basis of
diseases and for drug discovery because this phenomenon
is involved in drug targeting, pharmacodynamics,
drug side effects and drug toxicology (28-31).
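
As a rough illustration of what identifying hubs involves, the sketch below counts the number of distinct interaction partners (the degree) of each protein in an edge list and reports those at or above a threshold. It is not the analysis of reference (14). The edge list uses placeholder identifiers, and the threshold is an arbitrary assumption chosen only to make the example readable.

```python
from collections import Counter


def degrees(edges):
    """Number of distinct interaction partners for each protein."""
    # Treat edges as undirected and drop duplicates and self-interactions.
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique:
        for protein in pair:
            degree[protein] += 1
    return degree


def hubs(edges, min_degree):
    """Proteins whose degree is at least min_degree, sorted by name."""
    return sorted(p for p, d in degrees(edges).items() if d >= min_degree)


# Toy interaction list with placeholder identifiers (not real data).
EDGES = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
    ("P2", "P1"),  # duplicate of the first edge, counted once
]

print(degrees(EDGES))
print(hubs(EDGES, min_degree=3))
```

In practice, the cut-off separating hubs from other nodes depends on the degree distribution of the particular interactome, so any fixed value should be treated as a tunable parameter.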